SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech

Authors

Wei Chu

Abeer Alwan

Abstract

A novel Statistical Approach for F0 Estimation, SAFE, is proposed to improve the accuracy of F0 tracking under both clean and additive noise conditions. Prominent Signal-to-Noise Ratio (SNR) peaks in speech spectra are a robust information source from which F0 can be inferred. A probabilistic framework is proposed to model the effect of additive noise on voiced speech spectra. It is observed that prominent SNR peaks located in the low frequency band are important to F0 estimation, and that prominent SNR peaks in the middle and high frequency bands provide useful supplemental information under noisy conditions, especially in babble noise. Experiments show that the SAFE algorithm achieves lower Gross Pitch Error (GPE) than the prevailing F0 trackers Get F0, Praat, TEMPO, and YIN in white and babble noise conditions at low SNRs.
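The abstract reports results as Gross Pitch Error but does not define the metric. The sketch below shows one common way GPE is computed: the fraction of frames, voiced in both the reference and the estimated track, whose estimate deviates from the reference by more than a relative tolerance. The 20% tolerance, the convention that 0 marks an unvoiced frame, and the function name `gross_pitch_error` are assumptions made for illustration; they are not taken from the paper.

```python
def gross_pitch_error(f0_ref, f0_est, tol=0.2):
    """Fraction of commonly voiced frames with relative F0 error above tol.

    Assumptions (not from the paper): a value of 0 marks an unvoiced
    frame, and tol=0.2 (20%) is the gross-error threshold.
    """
    # Keep only frames that both tracks consider voiced.
    pairs = [(r, e) for r, e in zip(f0_ref, f0_est) if r > 0 and e > 0]
    if not pairs:
        return 0.0
    # Count frames whose relative deviation exceeds the tolerance.
    gross = sum(1 for r, e in pairs if abs(e - r) / r > tol)
    return gross / len(pairs)


# Toy example with made-up F0 values in Hz (0 = unvoiced).
ref = [100.0, 100.0, 0.0, 200.0, 200.0]
est = [101.0, 200.0, 0.0, 190.0, 0.0]
gpe = gross_pitch_error(ref, est)
```

In the toy example, three frames are voiced in both tracks, and only the octave error (200 Hz against a 100 Hz reference) counts as gross, so `gpe` is one third.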

Similar resources

On a robust F0 estimation of speech based on IRAPT using robust TV-CAR analysis

Fundamental frequency (F0) estimation is important in speech processing tasks such as speech coding, synthesis, and recognition. Present F0 estimation methods perform well under clean conditions; however, performance deteriorates significantly in noisy environments. As a result, F0 estimation that is robust against additive noise is in demand. We have previously proposed F0 estimation methods based o...

A fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency

This paper proposes a robust and accurate F0 estimation method for noisy speech. This method uses two different principles: (1) an F0 estimation based on periodicity and harmonicity of instantaneous amplitude for a robust estimation in noisy environments, and (2) an F0 estimation based on stability of instantaneous frequency as an accurate estimation method. The proposed method also uses a comb...

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

In recent years, automatic recognition of speech emotional states in noisy conditions has become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Improving YANGsaf F0 Estimator with Adaptive Kalman Filter

We present improvements to the refinement stage of YANGsaf[1] (Yet ANother Glottal source analysis framework), a recently published F0 estimation algorithm by Kawahara et al., for noisy/breathy speech signals. The baseline system, based on time-warping and weighted average of multi-band instantaneous frequency estimates, is still sensitive to additive noise when none of the harmonic provide rel...

A new method for robust speech recognition based on missing data using a bidirectional neural network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing feature method. In this approach, the components in the time-frequency representation of the signal (spectrogram) that present a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced by remained components and statistical ...

Publication date: 2010
